Viruses in the families Arteriviridae and Coronaviridae have enveloped virions which contain
nonsegmented, positive-stranded RNA, but the constituent genera differ markedly in genetic
complexity and virion structure. Nevertheless, there are striking resemblances among the viruses
in the organization and expression of their genomes, and sequence conservation among the
polymerase polyproteins strongly suggests that they have a common ancestry. On this basis, the
International Committee on Taxonomy of Viruses recently established a new order, Nidovirales, to
contain the two families. Here, the common traits and distinguishing features of the Nidovirales
are reviewed. © 1997 Academic Press
INTRODUCTION

The Nidovirales (summarized in Table 1) is a newly established order comprising the families
Arteriviridae (genus Arterivirus) and Coronaviridae (genera Coronavirus and Torovirus). Species
in the genus Coronavirus can be grouped into three clusters on the basis of serological and
genetic properties (1). Two torovirus species have been recognized: the equine and bovine
toroviruses (ETV, Berne virus; and BoTV, Breda virus). In addition, a human torovirus is thought
to exist (2) and we have recently identified a porcine torovirus (PoTV) (Kroneman et al.,
unpublished). The genus Arterivirus presently contains four species.

Despite considerable differences in genetic complexity and virion architecture, coronaviruses,
toroviruses, and arteriviruses are strikingly similar in genome organization and replication
strategy (3) (Fig. 1). The name Nidovirales (from the Latin nidus, nest) refers to the 3'
coterminal nested set of subgenomic (sg) viral mRNAs that is produced during infection. Sequence
similarities, although mostly restricted to the 1b polyprotein (POL1b) from which the
replicase-associated proteins are derived, suggest that the Nidovirales have evolved from a
common ancestor. Apparently their divergence has been accompanied by extensive genome
rearrangements through heterologous RNA recombination.

1 To whom correspondence and reprint requests should be addressed. Fax: +31-30-2536723.
E-mail: R.Groot@vetmic.dgk.ruu.nl.

Here, we review the common traits and distinguishing features of the genome organization, gene
expression, and evolution of the Nidovirales. Other reviews can be found in references 3 to 9,
and the different models proposed for sg mRNA synthesis are discussed in references 8 to 10.


VIRION ARCHITECTURE 
AND STRUCTURAL PROTEINS 

The phylogenetic relationship among arteriviruses, toroviruses, and coronaviruses is not
apparent from their morphology. Coronavirions are roughly spherical, 100-120 nm in diameter,
with a fringe of c. 20-nm-long petal-shaped spikes. Some group II coronaviruses exhibit a second
fringe of smaller surface projections about 5 nm in length. Torovirus particles are
pleiomorphic, measuring 120 to 140 nm in their largest axis; spherical, oval, elongated, and
kidney-shaped virions have been described. The surface projections on torovirus virions closely
resemble coronavirus peplomers (11). Arterivirions are only 50-70 nm in diameter and lack large
surface projections. Instead, cup-like structures with a diameter of 10 to 15 nm have been
observed (12). The differences in virion architecture become even more apparent when comparing
the nucleocapsid structures. That of coronaviruses is a loosely wound helix (13), that of
toroviruses is a compact tubular structure (11), and that of arteriviruses is isometric, about
25-35 nm in diameter, and possibly icosahedral (12). The nucleocapsid proteins (N) differ
considerably in size (c. 50, 19, and 14 kDa for corona-, toro-, and arteriviruses, respectively)
and amino acid sequence.

TABLE 1

Order: Nidovirales

Family          Genus         Species                                               Abbreviation   Group
Arteriviridae   Arterivirus   Equine arteritis virus                                EAV
                              Porcine reproductive and respiratory syndrome virus   PRRSV
                              Lactate dehydrogenase-elevating virus                 LDV
                              Simian hemorrhagic fever virus                        SHFV
Coronaviridae   Torovirus     Equine torovirus                                      ETV
                              Bovine torovirus                                      BoTV
                              Porcine torovirus                                     PoTV
                Coronavirus   Transmissible gastroenteritis virus                   TGEV           I
                              Feline coronavirus                                    FCoV           I
                              Canine coronavirus                                    CCV            I
                              Human coronavirus                                     HCV 229E       I
                              Porcine epidemic diarrhea virus                       PEDV           I
                              Mouse hepatitis virus                                 MHV            II
                              Bovine coronavirus                                    BCV            II
                              Human coronavirus                                     HCV OC43       II
                              Porcine hemagglutinating encephalomyelitis virus      HEV            II
                              Sialodacryoadenitis virus                             SADV           II
                              Turkey coronavirus                                    TCV            II
                              Infectious bronchitis virus                           IBV            III

The compositions of the viral envelopes also differ. Coronavirus membranes contain (i) a 180- to
220-kDa spike protein (S), (ii) a 25- to 30-kDa triple-spanning membrane protein (M), and (iii) a
c. 10-kDa transmembrane protein (E), a minor virion component but essential for virus assembly
(14,15). The small surface projections of group II coronaviruses are dimers of a 65-kDa class I
membrane protein, the hemagglutinin-esterase (HE), possibly acquired by heterologous RNA
recombination (16,17).

Toroviruses also specify M and S proteins of 26 and 180 kDa, respectively. Although different in
sequence, the M and S proteins of toro- and coronaviruses are alike in size, structure, and
function. The M proteins have a similar triple-spanning membrane topology (18), and the heptad
repeats, indicative of a coiled-coil structure in the spike proteins of coronaviruses (19), are
also present in the torovirus peplomer (20). Thus, the S and M genes of these viruses may well
be phylogenetically related (6,18,20). Puzzlingly, toroviruses seem to lack a homologue for the
E protein, which could indicate a difference in assembly. We have found recently that BoTV
virions contain a third membrane protein, the 65-kDa hemagglutinin-esterase (145).

The structural proteins of arteriviruses are unrelated to those of the Coronaviridae. There is a
basic set of three envelope proteins (21-24): (i) a 16- to 20-kDa nonglycosylated membrane
protein (M) which traverses the membrane three times and thus structurally resembles the M
protein of corona- and toroviruses, (ii) a heterogeneously N-glycosylated triple-spanning
protein (designated GL for EAV) of variable size, and (iii) a class I glycoprotein of 25-30 kDa
(designated GS for EAV) which is a minor virion component. The GL and M proteins associate into
disulphide-linked heterodimers and probably form the cup-like structures on the virion surface
(24-26).

GENES AND REGULATORY ELEMENTS 

Overall Genome Structure 

Nidoviral genome RNA is single-stranded, infectious, polyadenylated (27-29), and, at least for
arteri- and coronaviruses, 5' capped (30,31). Nucleotide sequences are known for the complete
RNA of coronaviruses MHV, IBV, TGEV, and HCV 229E and arteriviruses EAV, LDV, and PRRSV (32-39)
and for parts of the RNA of several other Nidovirales, including ETV strain Berne (40,41) and
SHFV (Godeny et al., in press). The size of the arterivirus genome is from 13 to 15 kb. The
genomes of toroviruses and coronaviruses are considerably larger (up to 31 kb) and include the
largest known RNA genomes. Despite the differences in genetic complexity and gene composition,
the genome organizations of arteri-, toro-, and coronaviruses are remarkably similar. More than
two-thirds of each genome is taken up by two huge overlapping open reading frames (ORFs),
designated ORF1a and 1b. The more downstream, ORF1b, is only expressed after translational
read-through via a -1 frameshift mediated by a pseudoknot structure (42). The polypeptides
encoded by these ORFs are proteolytically cleaved by virus-encoded proteinases to yield the
proteins involved in viral RNA synthesis.

FIG. 1. (a) Scale representation of archetypical Nidovirales genomes. The torovirus genome
organization is based on combined data for ETV and BoTV, and that of coronavirus is typical for
a group I member (Table 1). The 5' ends of ORF1b have been aligned. The bottom panel illustrates
the 3' coterminal nested set of mRNAs produced during coronavirus infection. ORFs are
represented by boxes. The coding assignments are also indicated. Hatched boxes represent the
ORFs for HE and the 30-kDa ns2a protein of group I coronaviruses and related sequences in the
torovirus genome. The arrow indicates the position of the pseudoknot structures required for
translational read-through of ORF1b. The 5' leader sequences are depicted by a small black box.
Poly(A) tails are indicated by An. (b) Sequence conservation in the POL1b polyproteins.
Conserved domains are indicated by hatching. RdRp, Zf, and H indicate the RNA-dependent RNA
polymerase, zinc finger, and helicase motifs, respectively. Domains 1-3 indicate conserved
regions for which as yet no function has been suggested. Motif 2 corresponds to the previously
described CVL domain. Motif 1 has not been described before. Bracketed lines indicate
(predicted) proteolytic cleavage sites (for details see text). (c) Sequence conservation in
motif 1. Sequences were taken from (32,34,36,41,144). Residues conserved between corona- and
toroviruses are boxed.
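
The bookkeeping behind a -1 frameshift of the kind that gives access to ORF1b can be sketched in a few lines. This is only an illustration: the sequence, the slip position, and the function name are hypothetical and are not taken from any nidovirus genome, and the pseudoknot that promotes the slip is not modelled.

```python
def codons_with_minus1_shift(rna: str, slip_end: int) -> list[str]:
    """List codons read in frame 0 up to slip_end, then in the -1 frame.

    slip_end is the index just past the last codon read before the slip
    (a multiple of 3). After the slip, one nucleotide is read a second time.
    """
    if slip_end % 3 != 0 or not 0 < slip_end <= len(rna):
        raise ValueError("slip_end must be a positive multiple of 3 within the sequence")
    upstream = [rna[i:i + 3] for i in range(0, slip_end, 3)]
    restart = slip_end - 1  # step back one nucleotide (-1 frameshift)
    downstream = [rna[i:i + 3] for i in range(restart, len(rna) - 2, 3)]
    return upstream + downstream


# Toy sequence (hypothetical); the slip occurs after the third codon.
toy = "AUGUUUAAACGGGCC"
print(codons_with_minus1_shift(toy, 9))
```

In this toy case the last nucleotide of the third codon is re-read as the first nucleotide of the fourth, so the downstream codons are decoded in the -1 frame relative to the upstream ones.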

Downstream of ORF1b, there are four to nine genes that encode the structural proteins and, at
least for coronaviruses, a number of nonstructural proteins. These genes are expressed from a 3'
coterminal nested set of sg mRNAs (8,40,43,44). Although these mRNAs are structurally
polycistronic, translation is restricted to the unique 5' sequences not present in the next
smaller RNA of the set. Cells infected by arteriviruses or coronaviruses contain
negative-stranded RNAs which correspond to each mRNA and which may serve as templates for
transcription (45-49).
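
The idea of a 3' coterminal nested set, in which only the unique 5' region of each RNA is translated, can be made concrete with a small sketch. The coordinates below are invented for illustration and do not correspond to any particular virus; the function name is likewise hypothetical, and the leader and 3' nontranslated sequences are ignored.

```python
def nested_set(body_starts, genome_length):
    """Describe a 3'-coterminal nested set of RNAs.

    body_starts: 5' start coordinate of each RNA body on the genome
                 (hypothetical values; 0 stands for the genome-length RNA).
    Returns (start, end, unique_region) tuples, longest RNA first.
    """
    starts = sorted(set(body_starts))
    mrnas = []
    for i, start in enumerate(starts):
        # The unique 5' region ends where the next smaller RNA begins.
        unique_end = starts[i + 1] if i + 1 < len(starts) else genome_length
        mrnas.append((start, genome_length, (start, unique_end)))
    return mrnas


for start, end, unique in nested_set([0, 20000, 24500, 26000, 27500], 30000):
    print(f"RNA {start}-{end}: unique 5' region {unique[0]}-{unique[1]}")
```

Every RNA in the output shares the same 3' end, and the unique regions tile the genome without overlap, which is the sense in which structurally polycistronic mRNAs are functionally restricted to their 5' sequences.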


Sequence Elements Regulating Transcription 

Each transcription unit (comprising one or more genes expressed from a single mRNA species) is
preceded by a short consensus sequence, the complement of which is thought to function as a
promoter: the transcription-associated sequence (TAS) (3,10,50). The relative strength of
coronavirus promoters is influenced by the primary structure of the TAS (10,50,51) and the
presence downstream of other TASs. In general, downstream TASs have a negative effect on
transcription levels from upstream sites (52-54). For MHV, host proteins of 35 and 38 kDa have
been identified that specifically bind to the TAS and may serve as transcription factors
(9,55,56).

The sg mRNAs of corona- and arteriviruses carry a 5' leader sequence of 55-92 and about 200 nt,
respectively, which are derived from the 5' ends of the viral genomes. The mRNA synthesis thus
requires, at least at one point, a discontinuous transcription event (43,44). The fusion of
"leader" and "body" sequences occurs within or in close proximity to the TAS (10,49,57,58).
Puzzlingly, the torovirus mRNAs seem to lack an extensive 5' leader sequence (40,59). Thus if
the use of a leader sequence evolved before the divergence of the Nidovirales, toroviruses must
have lost their leader relatively recently. The close evolutionary relationship between toro-
and coronaviruses suggests that this event took place after the Coronaviridae and Arteriviridae
diverged. Alternatively, the common ancestor of the Nidovirales may have used a
leader-independent transcription mechanism and arteri- and coronaviruses acquired a 5' leader
independently. In either view, the addition of noncontiguous leader sequences would not be a
mechanistically important aspect of mRNA synthesis (as suggested by the "leader-primed"
transcription model) (8) but rather a modification of a common transcription scheme, based
primarily on transcriptase-promoter recognition (9,60). What then is the function of the leader
sequence? Perhaps the discontinuous transcription seen in arteri- and coronaviruses has evolved
merely to provide each viral mRNA with a translational enhancer, allowing efficient competition
with host mRNAs for the cellular translational machinery. Indeed, there is evidence that the
coronavirus leader sequence stimulates viral translation in cis, possibly in conjunction with a
virus-specified or virus-induced factor (61).

For a complete understanding of Nidovirales transcription initiation, studies on torovirus mRNA
synthesis will be pivotal. In fact, the existence of a small torovirus leader RNA cannot
entirely be excluded. Sequence analysis of ETV defective interfering (DI) RNAs, combined with
results of primer extension studies, suggests that a TAS is present at the extreme 5' end of the
viral genome which could give rise to a leader of approximately 8 nt (59).


5' and 3' Nontranslated Regions 

The promoters required for genome replication are commonly found at the 5' and 3' ends of the
genome. Coronaviruses have nontranslated regions (NTRs) ranging from 0.2 to 0.5 kb (5') and from
0.3 to 0.5 kb (3'). Their primary structure is poorly conserved among the different subgroups.
Deletion mapping studies using synthetic DI RNAs suggest that for the group II coronaviruses,
about 0.5 kb of each end of the genome is required for replication, implying that promoter
elements may extend into ORF1a and the N gene (10,62-66). All coronavirus genome RNAs have the
sequence 5' U/GGGAAGAGC 3' about 70 nt upstream of the poly(A) tail (67,68). The strict
conservation of this sequence element suggests that it has a role in replication. Surprisingly,
however, the 3'-most 55 nt of the 3' NTR of MHV appear to be sufficient to drive minus-strand
synthesis (69).
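
A conserved element such as 5' U/GGGAAGAGC 3' can be located relative to the poly(A) tail with a simple pattern search. The sketch below assumes that the distance is measured from the first nucleotide of the motif to the first nucleotide of the tail, and that a terminal run of at least ten A residues counts as the tail; both conventions, like the function name and the toy sequence, are assumptions made for illustration.

```python
import re

# 5' U/GGGAAGAGC 3': the first position is U or G.
MOTIF = re.compile(r"[UG]GGAAGAGC")


def motif_distances_to_poly_a(rna: str, min_tail: int = 10) -> list[int]:
    """Distances (nt) from each motif start to the start of the terminal poly(A) run."""
    tail = re.search(r"A{%d,}$" % min_tail, rna)
    if tail is None:
        return []  # no recognizable poly(A) tail
    body = rna[:tail.start()]
    return [tail.start() - m.start() for m in MOTIF.finditer(body)]


# Toy 3' end (hypothetical): motif, 61 nt of filler, then a 20-nt poly(A) tail.
toy = "CCC" + "GGGAAGAGC" + "U" * 61 + "A" * 20
print(motif_distances_to_poly_a(toy))
```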

The 3' NTRs of toroviruses are about 0.3 kb. The 5' NTR of ETV strain Berne is 0.8 kb (59) but
the lengths of 5' NTRs of other toroviruses are unknown. The 5' NTRs of arteriviruses are about
0.2 kb and, unlike those of coronaviruses, consist almost entirely of the leader (37-39,70). The
3' NTRs of arteriviruses are also short, ranging from 59 to 151 nt, and conserved sequence
elements have not been found.

POLYPROTEIN PROCESSING:
THE POLYMERASE GENE

The overlapping ORFs 1a and 1b found at the 5' end of the nidoviral genome are frequently
referred to as the "polymerase gene." However, there is little doubt that the processing of the
encoded polyproteins yields proteins required for RNA synthesis as well as a number of products
involved in other aspects of virus replication. The 1a and 1b polyproteins of coronaviruses are
3951 to 4492 and 2682 to 2714 residues long, respectively. POL1b of ETV strain Berne consists of
2289 residues; only limited sequence data are available for torovirus ORF1a. The polyproteins of
arteriviruses are much smaller, with lengths of 1727-2396 (POL1a) and 1411-1459 (POL1b)
residues.
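
To put these lengths in the same units as the cleavage products discussed below, residue counts can be converted to approximate masses. The sketch uses an average of 110 Da per residue, a common rule of thumb that is an assumption here and not a value given in the text; real masses depend on amino acid composition.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption, not from the text


def approx_mass_kda(n_residues: int) -> float:
    """Rough polypeptide mass in kDa from its length in residues."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Length ranges (residues) quoted in the paragraph above.
lengths = {
    "coronavirus POL1a": (3951, 4492),
    "coronavirus POL1b": (2682, 2714),
    "ETV (Berne) POL1b": (2289, 2289),
    "arterivirus POL1a": (1727, 2396),
    "arterivirus POL1b": (1411, 1459),
}

for name, (low, high) in lengths.items():
    print(f"{name}: about {approx_mass_kda(low):.0f}-{approx_mass_kda(high):.0f} kDa")
```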

Amino acid sequence comparisons show that the 1b polyproteins of corona-, toro-, and
arteriviruses are basically colinear (37,41) (Fig. 1b). The sequence conservation between the
more closely related corona- and toroviruses is clustered in six domains, four of which are also
found in the arterivirus POL1b: the "classical" RNA-dependent RNA polymerase (RdRp) and helicase
(H) domains, which are also present in the polymerases of most other viruses, a zinc finger
motif (Zf), and a short region of 80-100 residues, which has not yet been identified in other
viral polymerases and was called the "coronavirus-like" (CVL) domain (3) (motif 2 in Fig. 1b).

Processing of Coronavirus POL1a Polyproteins
by Papain-like Proteinases

There is little sequence conservation among the N-termini of the POL1a polyproteins of the three
coronavirus subgroups. Size differences can mostly be attributed to these regions (Fig. 2) and
sequence similarities are limited to papain-like cysteine proteinase (pcp) domains
(33,34,36,71). POL1a of HCV 229E, TGEV, FIPV (subgroup I), and MHV (subgroup II) have two pcp
domains, whereas that of IBV (subgroup III) contains a single pcp domain. These pcp seem to be
involved in the processing of the N-termini of the 1a polyproteins.

The proteolytic cleavage of the N-terminus of the coronavirus 1a polyprotein has been studied in
most detail for MHV. In vitro translation of genomic RNA gave products of 28 and 220 kDa and the
production of p28 was sensitive to proteinase inhibitors, suggesting that it arose by
proteolytic cleavage (72). p28 was also detected in MHV-infected cells (73). Partial peptide
mapping revealed that p28 is derived from the N-terminus of POL1a (74). Baker et al. (75)
subsequently showed that the proteolytic activity responsible for the production of p28 mapped
to residues 1223-1695 of POL1a, which contains the N-terminal-most pcp domain (pcp1) (33).
Mutagenesis showed that any change of either Cys1137 or His1288 (Cys1121 and His1272 of MHV-A59)
(35,76) resulted in the loss of proteinase activity, suggesting that these residues form the
catalytic dyad (77). Cleavage to give p28 was at an RGV motif at the G247/V248 dipeptide bond
(78,79), and presumably occurred in cis (75). Reactions of specific antisera raised against
different regions of MHV POL1a with potential cleavage products with apparent molecular weights
of 65, 50, 240, and 290 kDa in MHV-infected cells (80,81) showed that processing of the
N-terminus of POL1a involves multiple cleavage events. p65 is thought to be immediately adjacent
to p28 (81,82). Gao et al. (82) reported that p65 of MHV strain JHM is generated from a p72
precursor, but this precursor has not been observed by others studying MHV strain A59 (81).
Kinetic analysis suggests that p290 is a precursor to p50 and p240. A provisional map of the
POL1a region of MHV is shown in Fig. 2. The proteinases involved in the release of p65, p50, and
p240 have not yet been identified. Although some authors have implicated pcp1 in the cleavage of
p65 (76), this is disputed by others (82).

Only limited data are available on the processing of the N-terminus of POL1a of IBV. Using
monospecific antisera raised against residues 49-514 or 247-599, Liu et al. (83) detected an
87-kDa product in IBV-infected cells. It is not known if IBV p87 represents the N-terminal
cleavage product or if an additional smaller product is released from the N-terminus of POL1a.
p87 was also found upon in vivo expression of the N-terminal 1742 residues of IBV POL1a (83),
which include the pcp domain (33,71). Interestingly, p87 was not detected after in vivo
expression of a shorter N-terminal polypeptide of 1444 residues that lacked pcp, strongly
suggesting that pcp is involved in the release of this product. Because p87 did not appear when
the 1742-residue polypeptide was produced by in vitro translation, cellular factors may also be
involved in this cleavage event. However, in vivo processing of this polypeptide was also
inefficient, possibly because the pcp is located at the C-terminus of the 1742-residue
expression product and sequences downstream of this domain are required for optimal proteolytic
activity.

FIG. 2. Proteolytic processing of the coronavirus polyproteins POL1a and POL1ab. Provisional
cleavage maps were constructed on the basis of the combined data discussed in the text. POL1a
and POL1b sequences are indicated by boxes. Papain-like (pcp) and 3C-like cysteine proteinase
(3clp) domains are indicated by shading, as are the RNA-dependent RNA polymerase (RdRp), zinc
finger (Zf), and helicase domains (H). Also shown are the hydrophobic domains, mp1 and mp2, that
flank 3clp. Cleavage sites that have been identified experimentally either by protein sequence
analysis or by site-directed mutagenesis are indicated by black arrows. White arrows indicate
cleavages for which the exact cleavage site has not been determined. Cleavage products are
designated after their apparent molecular weight as determined by SDS-PAGE. Proteinases involved
in each cleavage event are given. Question marks indicate cleavages for which the proteinase has
not yet been identified. Open arrowheads indicate predicted cleavage sites for 3clp.

In our laboratory, a monospecific antiserum, raised against the N-terminal 198 residues of the
1a polyprotein of FIPV, specifically recognized products of 12, 83, and 100 kDa in FIPV-infected
cells. These products were also found upon in vivo expression of the N-terminal 1446 residues of
FIPV POL1a containing the pcp1 domain. Kinetic analysis suggested that p12 and p83 are mature
products with p100 as their precursor. p12 reacted with antiserum raised against the N-terminal
15 residues of POL1a, showing it to be the N-terminal-most cleavage product. pcp1 appears to be
involved in the release of both p12 and p83. Substitution of the presumptive catalytic cysteine
residue of pcp1 (Cys1117) completely abolished proteolytic activity (Fig. 2; De Groot et al., in
preparation).


Processing of the Coronavirus Polyproteins
1a and 1ab by the 3C-like Proteinase

In contrast to the N-termini, the C-terminal third of the coronavirus POL1a polyproteins is well
conserved. All contain a proteinase domain flanked by hydrophobic regions, designated mp1 and
mp2 (Fig. 2). This proteinase is related to the chymotrypsin-like serine proteases, but with a
cysteine rather than a serine residue as the active site nucleophile (33,34,36,71,84). A similar
situation exists in the 3C proteinases of picornaviruses and 3C-like proteinases of plant
viruses (85).

The 3C-like proteinases (3clp) of coronaviruses are involved in the processing of the C-terminus
of POL1a and of POL1ab. The results obtained for IBV, MHV, and HCV 229E differ only in details.
The 3clp mediates at least four cleavage events. It autocatalytically excises itself from the
polyprotein precursor, yielding products of 35, 27, and 34 kDa for IBV, MHV, and HCV 229E,
respectively (86-89) (Fig. 2). The release of IBV 3clp (but not that of MHV) from a synthetic
precursor in vitro was dependent on the presence of microsomal membranes and apparently required
membrane-association of the flanking lipophilic domains (86,87). Lu et al. (88) proposed that
because production of the MHV p27 in vitro was sensitive to dilution, the autocatalytic release
of 3clp occurs mainly in trans. Protein sequence analysis identified Q3333/S3334 and Q2965/A2966
as the respective N-terminal cleavage sites of MHV p27 and HCV p34 with the Gln residues in the
P1 position (87,89). p35 of IBV is generated by cleavage of QS dipeptides at positions 2779-2780
and 3086-3087 (86). The cleavage sites flanking 3clp are well conserved among the different
coronaviruses.

Processing of the POL1ab polyprotein by 3clp also resulted in the production of a polypeptide of
c. 100 kDa, containing the RdRp domain (90-92). The cleavage sites for IBV and HCV 229E were at
the positionally conserved dipeptides Q3928/S3929 and Q4868/S4869 or Q4068/S4069 and
Q4995/A4996, respectively, the N-terminal-most of which are located in POL1a (Fig. 2).
Processing leading to the release of the RdRp can occur in trans, both in vitro and in vivo
(91,92).

Gorbalenya et al. (71) predicted that the catalytic site of the IBV 3clp consists of a triad
formed by His2820, Glu2843, and Cys2922. The Cys and His residues are conserved in the 3clp
domain of the other coronaviruses and their involvement in proteolysis has been confirmed by
site-directed mutagenesis (86,87,89,91). Glu2843 is not part of the catalytic site. This residue
is not conserved in other 3clp and substitution by Asn, Asp, or Gln did not affect proteolytic
activity (91). In agreement with the assumed evolutionary relationship with cellular
trypsin-like serine proteases, the coronavirus 3clp are sensitive to both serine and cysteine
protease inhibitors (86,88). Moreover, substitution of the active site Cys by Ser yielded an IBV
3clp which was still partially active (86).

The cleavage sites of the coronavirus 3clp conform to the consensus XQZ, with X being a
hydrophobic residue (L, V, I, M or F) and Z a small uncharged residue (S, A, G or C). These data
provide experimental support to earlier predictions (33,71). Alignment of POL1ab sequences
suggests that 3clp may cleave at seven additional conserved sites (Fig. 2). Cleavage at the
sites in MHV POL1a would produce four extra polypeptides with predicted molecular weights of 33,
10, 34, and 15 kDa. The 33-kDa product would contain the hydrophobic domain mp2, whereas the
15-kDa product would be a cysteine-rich polypeptide resembling murine epidermal growth factor in
sequence (71). Processing of POL1b would yield the RdRp and four other products. The zinc finger
and helicase motifs would be in a product of about 67 kDa and the conserved motif 1 would be in
a polypeptide of 59 kDa, whereas motifs 2 (the CVL domain) and 3 would be in products of 42 and
33 kDa, respectively (Figs. 1 and 2). The latter may correspond to a 33-kDa protein in lysates
of MHV-infected cells which reacted with antiserum against the 14 C-terminal amino acids of
POL1b (93).
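
The XQZ consensus lends itself to a simple sequence scan for candidate 3clp sites. The sketch below is a minimal illustration, assuming one-letter amino acid codes and 1-based residue numbering; the function name and the toy sequence are hypothetical. A match to the consensus only marks a candidate, since actual cleavage also depends on context that a pattern search cannot capture.

```python
import re

# X = L, V, I, M or F; Q in the P1 position; Z = S, A, G or C.
# A lookahead is used so that every position is examined independently.
SITE = re.compile(r"(?=([LVIMF])Q([SAGC]))")


def candidate_3clp_sites(protein: str) -> list[tuple[int, str]]:
    """Return (P1 position, 'XQ/Z') for each consensus match; positions are 1-based."""
    sites = []
    for m in SITE.finditer(protein):
        q_pos = m.start() + 2  # m.start() is the 0-based index of X
        sites.append((q_pos, f"{m.group(1)}Q/{m.group(2)}"))
    return sites


# Toy sequence (hypothetical), not a real polyprotein fragment.
toy = "MKTLQSGAVAVQAPPFQGKK"
for position, site in candidate_3clp_sites(toy):
    print(position, site)
```

Applied to a full POL1ab sequence, the reported positions could be compared directly with experimentally mapped sites such as those given above.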


Processing of the Arterivirus Polymerase 
Polyproteins 

Most of what is known about arterivirus polyprotein processing stems from the work of Snijder
and colleagues on EAV; only limited information is available for PRRSV and LDV. As for
coronaviruses, most sequence variation occurs in POL1a. Processing of the N-terminus of POL1a is
mediated by papain-like cysteine proteinases, whereas the C-terminus of POL1a and the conserved
1b polyprotein are probably processed by a 3C-like proteinase which is located at the C-terminus
of POL1a and flanked by hydrophobic domains (Fig. 3).

For both PRRSV and LDV (38,39), the N-terminus of POL1a contains two papain-like proteinase
domains, pcpα and pcpβ, which mediate their own release by cleavage in cis at C-terminal
cleavage sites, giving rise to products nsP1α and nsP1β (Fig. 3) (94). The PRRSV and LDV leader
proteinases share 48% sequence identity. For PRRSV, Cys76 and His146 are crucial for pcpα
activity (94), whereas cleavage by pcpβ was dependent on Cys276 and His345. For LDV, Cys76 and
Cys269 were identified as active site cysteines. The cleavage sites in POL1a have not been
mapped but, from the sizes of nsP1α and nsP1β and from the results of deletion analyses, are
predicted to be around position 170 for pcpα, between Tyr384 and Gly385 for PRRSV pcpβ, and
between Tyr380 and Gly381 for LDV pcpβ.

EAV is thought to have a single leader proteinase (37), corresponding to pcpβ of LDV and PRRSV.
However, relicts of nsP1α are still present in the N-terminus of EAV POL1a (94). The EAV pcpβ
homologue releases a 29-kDa protein, nsP1 (95) (Fig. 3), apparently exclusively by cleavage in
cis at G260/G261. The results of site-directed mutagenesis suggested that Cys164 and His230 form
the catalytic dyad (95).

Four additional mature cleavage products were identified in lysates of EAV-infected cells (96)
and were designated nsP2 to 5 (Fig. 3). The 61-kDa nsP2 protein is released by cleavage between
Gly831 and Gly832, and the catalytic activity responsible is within the N-terminal 165 residues
of nsP2, as this domain can induce cleavage at the 2/3 site in trans (97). Sequence comparisons
suggested that the catalytic residues in the cysteine proteinase domain were Cys270 and His332.
Substitutions of these residues completely abolished proteolytic activity, but so did
replacement of three other conserved cysteine residues (positions 319, 349, and 354). The N- and
C-terminal sequences of nsP2 are highly conserved among EAV, LDV, and PRRSV. In contrast, the
middle portions differ markedly in size (210-670 residues) and sequence (37-39) (Fig. 3),
suggesting that nsP2 has species-specific rather than genus-specific functions (94). Multiple
sequence alignments suggest that the nsP2/nsP3 cleavage sites for LDV and PRRSV are Gly-Gly at
positions 1286/1287 and 1462/1463, respectively.

FIG. 3. Proteolytic processing of the arterivirus polyproteins POL1a and POL1ab. The
(provisional) cleavage maps were constructed on the basis of the combined data discussed in the
text. POL1a and POL1b sequences are indicated by boxes. POL1a cleavage products are numbered
according to Snijder et al. (96). Also shown are the apparent molecular weights of the cleavage
products. The papain-like proteinase domains (pcp) and the nsP2 cysteine (cp) and the nsP4
serine proteinases (sp) are indicated by shading, as are the RNA-dependent RNA polymerase
(RdRp), zinc finger (Zf), and helicase domains (H). Also shown are the hydrophobic domains, mp1
and mp2, that flank nsP4. Cleavage sites that have been identified experimentally are indicated
by black arrows. White arrows indicate cleavages for which the exact cleavage site has not yet
been determined. Cleavages performed by the serine proteinase are given. Arched arrows depict
cleavages performed by the leader proteinases. Open arrowheads indicate predicted sp cleavage
sites; black arrowheads mark cleavages possibly performed by a cellular proteinase.

Inhibition of cleavage at the nsP2/3 junction abolishes downstream proteolytic events, which are
probably all mediated by a 3C-like serine protease (sp) (98) located within nsP4. Site-directed
mutagenesis results suggest that the catalytic triad of the nsP4 protease comprises His1103,
Asp1129, and Ser1184, while Thr1179 and His1198 may be involved in substrate recognition.
Snijder et al. (98) further identified three cleavage sites within POL1a (E1064/G1065,
E1268/S1269, and E1677/G1678) and two additional cleavage sites were proposed in the C-terminus
of POL1a (99). The corresponding cleavage sites in LDV and PRRSV in Fig. 3 are inferred.

Three putative recognition sequences for the nsP4 protease were predicted in POL1b. Proteolytic
cleavage at these sites would separate the RdRp motif from the putative metal binding and
helicase domains. Reaction with specific antisera detected four possible cleavage products
designated p80, p50, p26, and p12, respectively (Fig. 3), and a number of putative precursor
proteins in lysates of EAV-infected cells (99). The most N-terminal cleavage product, p80,
contains the RdRp domain, and the putative zinc finger and helicase motifs are in the adjacent
p50. The CVL domain (motif 2; Fig. 1b) is in p26.


Nidovirales Polyprotein Processing:
Differences and Common Concepts

No information is available on the processing of POL1b of toroviruses, although the sequence
contains a number of potential 3clp cleavage sites. Because the POL1b sequences of toro- and
coronaviruses are colinear (Fig. 1b), the processing of torovirus POL1b is likely to be very
similar to that of coronaviruses. There are some marked differences between Coronaviridae and
Arteriviridae. The latter lack a cleavage product containing motif 1 (Figs. 1b and 1c).
Moreover, it remains to be seen whether the C-terminal POL1b cleavage products of the Arteri-
and Coronaviridae are functionally equivalent.

For the arteri- and coronaviruses, POL1b processing would yield a product containing both the
helicase domain and the zinc finger motif. Such a combination is rare, but not unprecedented, as
it has also been seen in glh-1, a putative RNA helicase from Caenorhabditis elegans (100), and
the (putative) yeast RNA helicases Yer176W (101) and NAM7 (102,103). Most helicases lack zinc
finger motifs, and it is therefore unlikely that the zinc fingers are required for helicase
activity (100). Perhaps they confer sequence specificity, for example, in promoter recognition.

GENES EXPRESSED FROM 
SUBGENOMIC mRNAs 

ORFs and Coding Assignments 

The arteriviruses PRRSV, LDV, and EAV each possess
six genes, numbered 2-7 from the 5' end, that are
expressed from subgenomic mRNAs (37-39,44). These
ORFs usually overlap (Fig. 1a). ORFs 2, 5, 6, and 7 are
conserved among all arteriviruses and, using EAV
terminology, code for G_S, G_L, M, and N, respectively
(21,22,24,104,105). Sequence similarity can be detected
only at the amino acid level; the conservation is
generally low and, especially in the EAV proteins,
restricted to short domains. ORFs 3 and 4 are conserved
among PRRSV, LDV, and SHFV and code for
membrane glycoproteins, which, in the case of PRRSV,
are present in purified virions (106,107). The ORF4
product of EAV shares no obvious sequence similarity
with that of the other arteriviruses and has not been
detected in virus preparations. Surprisingly, SHFV
possesses three additional ORFs. From the limited
sequence similarities and the apparent positional conservation
of cysteine residues it appears that these
ORFs have arisen from a heterologous RNA recombination
event by which ORFs 2-4 were duplicated (E.
Godeny, personal communication).

Toroviruses apparently express only four genes from
subgenomic mRNAs, all of which encode structural
proteins. ETV and BoTV are genetically and serologically
closely related and share 84% sequence identity
in the 3'-most 3 kb of their genomes (145). PoTV is
more distant as judged from the sequence of its
nucleocapsid protein, which is only 68% identical to
those of the other two viruses (Kroneman et al.,
unpublished). Snijder et al. (108) noted the presence of
a small ORF completely contained within the N gene
of ETV. This ORF, which would encode a hydrophobic
polypeptide of approximately 10 kDa, is conserved in
BoTV but abrogated by a termination codon in PoTV.

Coronaviruses possess up to nine ORFs that are
expressed from sg mRNAs. Of these, the genes for only
the main structural proteins are conserved among the
three subgroups (sequence identities of approximately
30%) as is their relative position in the genome (5'
S-E-M-N 3'). Apparently, as coronaviruses diverged,
subgroup-specific sets of accessory genes were acquired
(5,7,109). For instance, the HE gene and ORF2a,
which encodes a cytoplasmic nonstructural phosphoprotein
of about 30 kDa (16,110,111) (Fig. 1), are only
found in group II viruses. Differences in gene composition
occur even among viruses of the same subgroup.
In CCV and FCoV, ORFs 7a and 7b are at the 3' end of
the genome (112,113), but TGEV, which is serologically
and genetically very closely related to CCV and FCoV,
lacks 7b (67). HCV 229E lacks both ORFs (68).

All accessory genes tested thus far are dispensable
for replication in vitro and in vivo (16,114-119). The
functions of the encoded proteins are poorly understood,
but at least some may be involved in virus-host
interactions and thus contribute to viral fitness. For
example, the 7b gene of FCoV codes for a nonstructural
26-kDa secretory glycoprotein (120). FCoV variants
that lack ORF7b readily arise in tissue culture, but
among naturally occurring FCoV strains, the gene is
strictly maintained and its loss correlates with reduced
virulence (118).

In contrast to the other Nidovirales, a number of
coronaviruses have polycistronic mRNAs which contain
up to three ORFs clustered in a single transcription
unit. Downstream ORFs are usually translated by
leaky scanning, but the synthesis of the E proteins of
IBV (ORF 3c) and MHV (ORF 5b) may involve internal
initiation of translation mediated by a ribosomal landing
pad (5,121-123). The N gene of some group II
coronaviruses contains a small internal ORF in the +1
reading frame (Fig. 1) that is expressed in infected cells
(24,125). It encodes a hitherto unrecognized structural
protein that is not essential for virus replication in vitro
and in vivo (119).


RNA Recombination: A Driving Force 
in Nidovirales Evolution 

The variation in coronavirus gene composition is
probably the result of heterologous RNA recombination
events during which gene modules (126) were
obtained either from nonrelated viruses or from the
host. The most compelling example is the HE gene, the
product of which is 30% identical to the N-terminal
subunit of the hemagglutinin-esterase fusion protein
(HEF) of influenza C virus (ICV) (16). Heterologous
RNA recombination events must also have taken place
during torovirus evolution. A 0.5-kb remnant of an HE
gene was found in the ETV genome (20) and an intact,
functional HE gene of 1.2 kb is present in the genome
of BoTV (Fig. 1; 145). The torovirus HE protein shares
30% sequence identity with both the influenza C virus
HEF and the coronavirus HE. In addition, sequences
related to ORF2a of group II coronaviruses were found
at the 3' end of ETV ORF1a (20) (Fig. 1). The HE and
the ORF2a-related sequences found in corona- and
toroviruses were probably not inherited from a common
ancestor, but acquired through separate heterologous
RNA recombination events (6,20) because (i) the
genes are in different positions in the two virus
genomes (Fig. 1) and (ii) it is highly unlikely that genes
retained during the considerable evolutionary divergence
between corona- and toroviruses would have
been lost from the genomes of coronavirus subgroups I
and III.

The differences among the main structural proteins
of the Nidovirales could also be explained by heterologous
recombination (3). A switch from an arterivirus-like
isometric nucleocapsid structure to the extended
helical nucleocapsid structures of the Coronaviridae
may have been a determining step in the divergence of
the Nidovirales (38). Removal of constraints on genome
size would have allowed toro- and coronavirus
ancestors to acquire large genomes and thus develop
the variation in gene composition seen today. A relatively
recent replacement of the N gene may subsequently
have led to the divergence of the toro- and
coronaviruses.

Homologous RNA recombination (128,129) may also
be an important force in Nidovirales evolution. High-frequency
recombination of coronavirus genomes has
been observed in tissue culture (130,131), in experimentally
infected animals (132), and in embryonated eggs
(133). Homologous recombination allows the rapid
exchange of beneficial mutations and also serves as a
correction mechanism counteracting Muller's ratchet
(134). There is evidence that homologous recombination
occurs in IBV genomes in the field (135,136,146)
and a genetic exchange between CCV and FCoV
serotype I strains may have resulted in the emergence
of a new FCoV serotype (118,137,138).

CONCLUDING REMARKS AND FUTURE 
PERSPECTIVES 

The nidoviral replicase module has given rise to
viruses that utilize similar replication strategies and
yet differ markedly in genetic complexity. Common to
the Nidovirales is the use of a nested set of mRNAs.
This property, often regarded as "unique," is shared
with the phylogenetically unrelated closteroviruses, a
genus of filamentous RNA viruses of plants (139,140).
Closteroviruses have genomes of up to 20 kb in length,
thus approaching the Coronaviridae in genetic complexity.
They also resemble the Nidovirales in genome
organization and expression, including the use of large
polymerase polyproteins, encoded by two overlapping
ORFs located at the 5' end of the genome, and
down-regulation of RdRp synthesis by ribosomal
frameshifting. These recent findings underscore the
power of convergent evolution and indicate that similarities
in genomic organization and common mechanisms
of gene expression and regulation are not
reliable taxonomic criteria by themselves. Even the
results of comparative sequence analysis should be
regarded with caution. Alignments of RdRp domains
have been presented to illustrate the evolutionary
relationship between the Nidovirales, but the phylogenetic
signal in this domain is not sufficient to support a
common ancestry of corona- and arteriviruses (141).
Here, the toroviruses provide the "missing link" and
thus justify a phylogenetic grouping of corona-, toro-,
and arteriviruses (141) (R Zanotto, personal communication).

The analyses of Nidovirales genomes and the studies
on polyprotein processing have led to the identification
of many viral proteins, some of which are
conserved and some of which are genus- or even
species-specific. The next formidable task will be to
determine the function of each of these products. What
is the added value of the nonconserved POL1a-derived
cleavage products? Are they antagonists of the
intracellular antiviral response or involved in host
shutoff? What are the functions of the proteins derived
from POL1b? Why are proteins containing motifs 1
and 3 lacking in arteriviruses and what are the consequences
for replication and transcription? Are replication
and transcription distinct processes? Is there a
developmental shift from replication to transcription
and if so, how is this regulated? What is the function of
the various accessory genes of coronaviruses and how
do they contribute to viral fitness? Many of these
questions may well be answered in the near future. Both
in Leiden (147) and in Utrecht (Glaser et al., in preparation),
full-length cDNA clones of the EAV genome
have been constructed, from which infectious transcripts
can be derived. For coronaviruses, no such
clones are yet available. However, homologous RNA
recombination can be exploited to introduce site-specific
mutations into the viral genome using synthetic
(DI) RNAs as donor sequence (65,119,142,143).
Targeted RNA recombination provides an attractive
strategy to characterize the various ts-mutants of MHV
(65). Undoubtedly, the recent development of methods
to study arteri- and coronaviruses by reverse genetics
heralds a new era in Nidovirales research.